Dynamic Control with Actor-Critic Reinforcement Learning
Author
Abstract
4 Actor-Critic Marble Control
4.1 R-code
4.2 The critic
4.3 Unstable actors
4.4 Trading off stability against optimality
Similar resources
A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems
An online adaptive reinforcement learning-based solution is developed for the infinite-horizon optimal control problem for continuous-time uncertain nonlinear systems. A novel actor–critic–identifier (ACI) is proposed to approximately solve the Hamilton–Jacobi–Bellman (HJB) equation using three neural network (NN) structures: the actor and critic NNs approximate the optimal control and the optimal value function,...
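For context, the continuous-time, infinite-horizon HJB equation that such an approach approximately solves has the standard control-affine form below. The dynamics f, g and cost weights Q, R are generic textbook symbols used for illustration, not quantities taken from that paper.

```latex
% Standard infinite-horizon HJB equation for dynamics \dot{x} = f(x) + g(x)u
% and running cost Q(x) + u^\top R u (generic symbols, not the paper's notation).
0 = \min_{u}\Big[\, Q(x) + u^\top R\, u + \nabla V^*(x)^\top \big( f(x) + g(x)\,u \big) \Big],
\qquad
u^*(x) = -\tfrac{1}{2}\, R^{-1} g(x)^\top \nabla V^*(x).
```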
Call Admission Control in Wireless DS-CDMA Systems Using Reinforcement Learning
PITIPONG CHANLOHA: Call Admission Control in Wireless DS-CDMA Systems Using Reinforcement Learning. Thesis advisor: Asst. Prof. Wipawee Hattagam, Ph.D., 95 pp. School of Telecommunication Engineering, academic year 2549 (2006). Keywords: Direct-Sequential Code Division Multiple Access (DS-CDMA) / Call Admission Control / Reinforcement Learning / Actor-Critic Reinfo...
Supervised Actor-Critic Reinforcement Learning
Editor’s Summary: An earlier chapter introduced policy gradients as a way to improve on stochastic search of the policy space when learning. This chapter presents supervised actor-critic reinforcement learning as another method for improving the effectiveness of learning. With this approach, a supervisor adds structure to a learning problem, and supervised learning makes that structure part of an actor-...
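The abstract is cut off, but the general idea of combining a supervisor with a learned actor can be illustrated with a small sketch: a composite controller that interpolates between a hand-coded supervisor's action and the learned actor's action via a gain k. The gain schedule, the proportional supervisor, and the linear-Gaussian actor below are illustrative assumptions, not the chapter's actual formulation.

```r
# Hedged sketch: one way a supervisor can be blended with a learned actor.
# The gain k, the proportional supervisor, and the linear-Gaussian actor are
# assumptions made for illustration only.

supervisor_action <- function(state) {
  -0.5 * state                                 # simple hand-coded proportional controller
}

actor_action <- function(state, theta, sigma = 0.1) {
  rnorm(1, mean = theta * state, sd = sigma)   # linear-Gaussian exploratory actor
}

composite_action <- function(state, theta, k) {
  # k = 1: follow the supervisor entirely; k = 0: follow the learned actor.
  k * supervisor_action(state) + (1 - k) * actor_action(state, theta)
}

# Example: gradually hand control from the supervisor to the actor.
theta <- 0.0
for (episode in 1:5) {
  k <- 1 - episode / 5                         # illustrative schedule only
  a <- composite_action(state = 1.0, theta = theta, k = k)
  cat(sprintf("episode %d  k = %.1f  action = %.3f\n", episode, k, a))
}
```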
Incremental Natural Actor-Critic Algorithms
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods...
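As a concrete illustration of the pattern described there (a value function estimated by temporal difference learning driving stochastic-gradient policy updates), here is a minimal tabular actor-critic on a two-state toy MDP. The toy dynamics, step sizes, discount factor, and softmax policy are assumptions made for this example, not the incremental natural-gradient algorithms of that paper.

```r
# Hedged sketch of a one-step actor-critic: TD(0) critic plus a softmax policy
# updated by stochastic gradient ascent weighted by the TD error.
set.seed(1)
n_states  <- 2
n_actions <- 2
gamma     <- 0.9
alpha_v   <- 0.1     # critic step size (assumed)
alpha_pi  <- 0.05    # actor step size (assumed)

V <- rep(0, n_states)                    # critic: state-value estimates
H <- matrix(0, n_states, n_actions)      # actor: action preferences

softmax <- function(h) exp(h - max(h)) / sum(exp(h - max(h)))

# Toy dynamics: action 1 stays, action 2 switches; reward 1 for landing in state 2.
step <- function(s, a) {
  s_next <- if (a == 1) s else 3 - s
  list(s_next = s_next, r = as.numeric(s_next == 2))
}

s <- 1
for (t in 1:5000) {
  pi_s <- softmax(H[s, ])
  a    <- sample(n_actions, 1, prob = pi_s)
  out  <- step(s, a)

  delta  <- out$r + gamma * V[out$s_next] - V[s]   # TD error
  V[s]   <- V[s] + alpha_v * delta                 # critic update
  grad   <- -pi_s; grad[a] <- grad[a] + 1          # d log pi / d H[s, ]
  H[s, ] <- H[s, ] + alpha_pi * delta * grad       # actor update

  s <- out$s_next
}

print(V)                          # learned state values
print(t(apply(H, 1, softmax)))    # policy: prefers switching in state 1, staying in state 2
```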
Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning
In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates which are achieved using natural stochastic policy gradients, while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the “building blocks of movement generation”, called motor ...
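The regression step mentioned there can be written compactly. In the standard episodic natural actor-critic formulation, per-episode returns are regressed on the summed score function of the policy, and the resulting weight vector w serves as the natural-gradient estimate; the symbols below are the usual textbook ones, not notation taken from this paper.

```latex
% Episodic natural actor-critic, standard form: regress episode returns R^{(e)} on the
% summed policy score (with intercept J); the weights w estimate the natural policy gradient.
R^{(e)} \;\approx\; \Big( \sum_{t=0}^{T_e - 1} \nabla_\theta \log \pi_\theta\!\big(a_t^{(e)} \mid s_t^{(e)}\big) \Big)^{\!\top} w \;+\; J,
\qquad
\theta \leftarrow \theta + \alpha\, w .
```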